Beyond Basic Search: Addressing the Limits of Semantic Similarity
AI010, Lesson 8

Beyond Similarity

The "80% Problem"

Basic semantic search performs well on simple queries but fails on edge cases. When you search by similarity alone, the vector store typically returns the numerically closest chunks. If those chunks are nearly identical, however, the LLM receives redundant information, wasting its limited context window and missing the broader picture.

Pillars of Advanced Retrieval

  1. Maximal Marginal Relevance (MMR): rather than selecting only the most similar items, MMR balances relevance against diversity to avoid duplication.
    $$MMR = \text{argmax}_{d \in R \setminus S} [\lambda \cdot \text{sim}(d, q) - (1 - \lambda) \cdot \max_{s \in S} \text{sim}(d, s)]$$
  2. Self-Query: uses an LLM to translate natural language into structured metadata filters (e.g. filtering by "Lecture 3" or "source: PDF").
  3. Contextual Compression: compresses retrieved documents to extract only the "high-value" snippets relevant to the query, saving tokens.
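The MMR formula above can be sketched in plain Python. This is a minimal greedy implementation for illustration, not the vector store's internal code; the cosine similarity and the toy 2-D vectors are assumptions:

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def mmr_select(query_vec, doc_vecs, k=2, lam=0.5):
    """Greedy MMR: at each step pick the candidate d maximizing
    lam * sim(d, q) - (1 - lam) * max_{s in S} sim(d, s)."""
    selected, remaining = [], list(range(len(doc_vecs)))
    while remaining and len(selected) < k:
        def score(i):
            relevance = cosine(doc_vecs[i], query_vec)
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance - (1 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Docs 0 and 1 are near-duplicates; doc 2 is different but still relevant.
docs = [[1.0, 0.0], [0.99, 0.1], [0.6, 0.8]]
query = [1.0, 0.2]
print(mmr_select(query, docs, k=2, lam=0.5))  # → [1, 2]
```

Pure similarity ranking would return the two near-duplicates (docs 1 and 0); MMR's redundancy penalty swaps the second pick for the more diverse doc 2.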
The Redundancy Trap
Feeding the LLM three versions of the same paragraph does not make it smarter; it only makes the prompt more expensive. Diversity is the key to "high-value" context.
retrieval_advanced.py
Knowledge Check
You want your system to answer "What did the instructor say about probability in the third lecture?" specifically. Which tool allows the LLM to automatically apply a filter for `{ "source": "lecture3.pdf" }`?

  - ConversationBufferMemory
  - Self-Querying Retriever
  - Contextual Compression
  - MapReduce Chain
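The self-querying idea behind this question can be illustrated without an LLM. Below is a toy rule-based parser, not LangChain's `SelfQueryRetriever`; the regex and the `lectureN.pdf` filename convention are assumptions. A real self-querying retriever delegates this translation step to an LLM:

```python
import re

def self_query(question):
    """Split a question into (semantic_query, metadata_filter).
    Toy stand-in: maps ordinal lecture references to a source filter."""
    ordinals = {"first": 1, "second": 2, "third": 3, "fourth": 4}
    filt = {}
    m = re.search(r"\b(first|second|third|fourth)\s+lecture\b", question, re.I)
    if m:
        filt["source"] = f"lecture{ordinals[m.group(1).lower()]}.pdf"
        # Strip the structural part, leaving the semantic query.
        question = question.replace(m.group(0), "").strip()
    return question, filt

q, f = self_query(
    "What did the instructor say about probability in the third lecture?"
)
print(f)  # → {'source': 'lecture3.pdf'}
```

The filter is applied to document metadata before (or alongside) the vector search, so only chunks from `lecture3.pdf` are scored for similarity.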
Challenge: The Token Limit Dilemma
Apply advanced retrieval strategies to solve a real-world constraint.
You are building a RAG system for a legal firm. The documents retrieved are 50 pages long, but only 2 sentences per page are actually relevant to the user's specific query. The standard "Stuff" chain is throwing an OutOfTokens error because the context window is overflowing with irrelevant text.
Step 1
Identify the core problem and select the appropriate advanced retrieval tool to solve it without losing specific nuances.
Problem: The context window limit is being exceeded by "low-nutrient" text surrounding the relevant facts.

Tool Selection: `ContextualCompressionRetriever`
Step 2
What specific component must you use in conjunction with this retriever to "squeeze" the documents?
Solution: Use an `LLMChainExtractor` as the base for your compressor. This will process the retrieved documents and extract only the snippets relevant to the query, passing a much smaller, highly concentrated context to the final prompt.
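As a sketch of what the extraction step does, here is a toy keyword-overlap compressor standing in for `LLMChainExtractor`. In a real pipeline this relevance judgment is an LLM call, not word matching; the stop-word list and sample legal text are assumptions:

```python
def compress(docs, query):
    """Keep only sentences sharing a content word with the query,
    mimicking what an LLM-based extractor does semantically."""
    stop = {"the", "a", "is", "of", "in", "what", "to", "and"}
    q_words = {w.lower().strip("?.,") for w in query.split()} - stop
    kept = []
    for doc in docs:
        for sent in doc.split(". "):
            words = {w.lower().strip("?.,") for w in sent.split()}
            if words & q_words:  # sentence mentions a query term
                kept.append(sent.strip())
    return kept

docs = [
    "The lease term is five years. Parking is available on-site. "
    "The termination clause requires 90 days notice.",
]
print(compress(docs, "What is the termination notice period?"))
# → ['The termination clause requires 90 days notice.']
```

Of a three-sentence document, only the sentence bearing on the query survives, so the final prompt carries one concentrated snippet per document instead of 50 pages of surrounding text.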